FAOSTAT 2.3
A revitalisation of the R wrapper for the FAOstat API
Project background
The motivation for this project came from a Data Mining project at UniLaSalle. It was suggested that we use FAO¹ data from their statistical platform, FAOstat². As R was the language of choice, the obvious port of call was the FAOSTAT package³ (Kao, Gheri, and Gesmann 2022), developed by employees at FAO.
1 Food and Agriculture Organization of the United Nations
2 Food and Agriculture Organization Corporate Statistical Database
3 For the purposes of clarity, this document will use the style “FAOSTAT” for the R package and “FAOstat” for the statistical platform
However, the FAOSTAT package did not work. It could not download data from the API and could only download bulk data, with the entirety of a dataset in one go. For the particular dataset we were interested in, we found that there was a discrepancy between the data in the bulk download and the data on the web platform.⁴
4 This discrepancy has been fixed as of 2023-03-10
Eventually it became necessary to use the same API that the FAOstat website uses to pull data. This method worked, and it became clear that it could be used to revitalise the FAOSTAT package as part of an effort to restore it to full functionality.
FAOstat
FAOstat is FAO’s web-based statistical platform for the free dissemination of food and agriculture statistics. This data is obtained from questionnaires that FAO distributes throughout the world every year (FAO 2019). Some of its data also comes from imputations and models where data is not available, but official country data takes precedence.
The FAOstat service is a public-facing aspect of FAO, with an overall trend of increasing citations in academic papers year on year with a peak of 21 400 citations in 2021 (Figure 1).
The platform uses a REST API internally to communicate with its database. It also provides a set of zip files, each containing the entirety of a dataset, in order to reduce the load on the database. The REST API allows the website to generate CSVs and to offer exploration of the data via interactive graphs (Figure 2).
FAOSTAT package
The FAOSTAT package is an API wrapper that pulls data from FAOstat into an R session. It can also perform small but necessary tasks such as country code conversion and coalescing data from different country groups.⁵
5 For example, China may be just the mainland or may include Taiwan (Chinese Taipei), Hong Kong and Macao
History
The FAOSTAT package was originally developed in 2013 as a tool to source data for the SYB⁶ project. The yearbooks are annual summaries of the worldwide state of agriculture. At the time, they were manually typeset and compiled. The new SYB project was to use a combination of LaTeX, knitr and R to automatically pull data from FAOstat and other data sources such as the World Bank. This data would then be transformed and processed to create graphs and tables, before being formatted and typeset into a finished product that could be printed.⁷ Given that this use case no longer exists, the primary use of the package is now for researchers and other R users to read data from FAOstat in a clean form that makes it easier to move on to analysis.
6 Statistical Year Book
7 The author has no insight into the current production of the SYB, but yearbooks are still being produced and can be found on the FAO website.
It is a reasonably popular package: as of 2023-04-01, it was in the 86th percentile of all CRAN packages by downloads. In total, the package has been downloaded over 50 000 times, with a peak of 121 daily downloads on 2019-05-15 (Li 2023).
The package was maintained by its original author, Michael Kao, from 2013 to 2014. Filippo Gheri then maintained it from 2014 until it passed to Paul Rougieux, the current maintainer, in 2020.
While it was originally hosted on GitHub under Michael Kao’s personal account, it is currently hosted on GitLab under Paul Rougieux’s personal account.
Current state
The FAOSTAT package has only a shadow of its former functionality. While it has retained its country code processing functions and the ability to download and process zip files,⁸ its capacities are limited by the following issues:
8 For a full description of the status of individual issues, please see the GitLab issue #20 Remove functions linked to defunct uses of FAOSTAT
Functionality locked to the Statistical Yearbook
A number of functions are simply designed to pull in data from other sources such as the World Bank and to process that data into a format easily consumed by the Statistical Yearbook. As the yearbook no longer uses the FAOSTAT package, these functions have no further purpose, serving only to clog up the package and its help files.
Functionality powered by local files
Many uses of the FAOSTAT package require data beyond what comes directly from FAOstat. The major use case is code conversion. There are two main code types that require conversion:
- Country codes
  - FAO: FAO’s internal codes for countries⁹
  - M49: the UN standard country codes
  - ISO2 & ISO3
- Item codes
  - FAO
  - CPC
9 For further details about FAO and how it handles country identification, see FAO’s NOCS database
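As a rough illustration of the kind of code conversion described above, the sketch below looks up one country code type from another in a small local table. It is written in Python purely for illustration and is not the package's R code. The function name `translate_country_code`, the table layout and the three sample rows are assumptions made for this example; any real use should take its codes from FAO's own reference tables.

```python
# Tiny illustrative lookup table (assumed sample rows, not the package's data).
# M49 codes are kept as strings so that leading zeros survive.
COUNTRY_CODES = [
    {"fao": 21, "m49": "076", "iso2": "BR", "iso3": "BRA"},
    {"fao": 68, "m49": "250", "iso2": "FR", "iso3": "FRA"},
    {"fao": 100, "m49": "356", "iso2": "IN", "iso3": "IND"},
]


def translate_country_code(code, source, target, table=COUNTRY_CODES):
    """Return the `target` code of the row whose `source` code equals `code`.

    Returns None when no row matches, so callers can decide how to treat
    unknown or aggregate areas.
    """
    for row in table:
        if row[source] == code:
            return row[target]
    return None


if __name__ == "__main__":
    print(translate_country_code(68, "fao", "iso3"))
    print(translate_country_code("BR", "iso2", "m49"))
```

Returning None rather than raising an error is a deliberate choice in this sketch: statistical datasets often contain aggregate areas that have no equivalent in every coding scheme.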
Change of FAOstat API
- Lots of SYB functions
- Maintained by someone outside of FAO (European Commission)
Project goals
- Fix up core functions
- Transfer maintainership
Methods
Please see issues in milestone: https://gitlab.com/paulrougieux/faostatpackage/-/issues/?sort=created_date&state=closed&milestone_title=2.3.0&first_page_size=20
- Refactor essential functions
  - getFAO -> read_fao
  - FAOsearch -> search_fao
- Add caching
- Add column metadata
- Documentation, examples, tests
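The caching item in the list above can be pictured with a minimal sketch: a download is performed only when no local copy exists yet. This is a Python illustration under stated assumptions, not how the package implements caching. The names `cached_read` and `fetch` are hypothetical, and the dataset code is assumed to be a short alphanumeric string that is safe to use as a file name.

```python
import json
from pathlib import Path


def cached_read(dataset_code, fetch, cache_dir):
    """Return the data for `dataset_code`, calling `fetch` only on a cache miss.

    `fetch` is a hypothetical callable that takes a dataset code and returns
    JSON-serialisable data, for example a list of records.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{dataset_code}.json"

    # Cache hit: reuse the local copy and skip the download entirely.
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))

    # Cache miss: download once, then store the result for later calls.
    data = fetch(dataset_code)
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data
```

A cache like this never expires, so a real implementation would also need some way to refresh stale files, for instance by comparing the file's age against the dataset's last update date.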
Use case
- Check health of zip data
- Display availability of certain country data (shiny app)
- Graph per country
- Colour by flag
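One way to picture the zip data health check listed above is to compare records from a bulk download with records pulled from the API and report where they disagree. The sketch below is a Python illustration only. The function name `compare_sources`, the field names `area`, `item`, `year` and `value`, and the shape of the report are all assumptions made for this example.

```python
def compare_sources(bulk_rows, api_rows,
                    key_fields=("area", "item", "year"),
                    value_field="value"):
    """Compare two lists of records and report where they disagree.

    Each record is a dict. Records are matched on `key_fields`, and
    matched records are compared on `value_field`.
    """
    def index(rows):
        # Map each record's key tuple to its value.
        return {tuple(row[k] for k in key_fields): row[value_field]
                for row in rows}

    bulk = index(bulk_rows)
    api = index(api_rows)

    return {
        "missing_from_bulk": sorted(api.keys() - bulk.keys()),
        "missing_from_api": sorted(bulk.keys() - api.keys()),
        "different_values": sorted(
            key for key in bulk.keys() & api.keys() if bulk[key] != api[key]
        ),
    }
```

An empty report for all three entries would suggest that the bulk file and the API agree for the records that were checked.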
Future work
- Release 3.0.0
- Fully integrating with the new API
- Publishing a paper in JOSS
- Publish news in rweekly
- Publish
Funding declaration
This project has been funded by the Food and Agriculture Organization of the United Nations and the author is grateful for their help in reviving it.